Bleu: a Method for Automatic Evaluation of Machine Translation
Authors
Abstract
Human evaluations of machine translation are extensive but expensive. Human evaluations can take months to finish and involve human labor that cannot be reused. We propose a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run. We present this method as an automated understudy to skilled human judges which substitutes for them when there is need for quick or frequent evaluations.
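Concretely, the method scores a candidate translation by its modified n-gram precision against one or more reference translations, combines the precisions geometrically across n-gram orders, and scales the result by a brevity penalty for overly short candidates. The sketch below illustrates that computation; it is a simplified single-sentence version with illustrative names, not the paper's reference implementation.

```python
# A minimal sketch of BLEU's core computation: clipped (modified) n-gram
# precision, a geometric mean over n-gram orders, and a brevity penalty.
# Function and variable names are illustrative, not from the paper.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    cand_counts = Counter(ngrams(candidate, n))
    # Clip each candidate n-gram count by its maximum count in any reference.
    max_ref = Counter()
    for ref in references:
        for g, c in Counter(ngrams(ref, n)).items():
            max_ref[g] = max(max_ref[g], c)
    clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
    return clipped, sum(cand_counts.values())

def bleu(candidate, references, max_n=4):
    log_prec = 0.0
    for n in range(1, max_n + 1):
        clipped, total = modified_precision(candidate, references, n)
        if clipped == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_prec += math.log(clipped / total) / max_n
    # Brevity penalty: penalize candidates shorter than the closest reference.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c >= r else math.exp(1 - r / c)
    return bp * math.exp(log_prec)

cand = "the cat is on the mat".split()
refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
print(round(bleu(cand, refs), 3))  # 1.0 for an exact reference match
```

In practice BLEU is computed at the corpus level, pooling clipped counts over all segments before taking precisions, which makes the score more stable than this per-sentence illustration.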
Similar References
Measuring Confidence Intervals for the Machine Translation Evaluation Metrics
Automatic evaluation metrics for Machine Translation (MT) systems, such as BLEU and the related NIST metric, are becoming increasingly important in MT. This paper reports a novel method of calculating the confidence intervals for BLEU/NIST scores using bootstrapping. With this method, we can determine whether two MT systems are significantly different from each other. We study the effect of tes...
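The bootstrap idea the abstract refers to can be sketched as follows: resample test-set segments with replacement, recompute the corpus score on each resample, and read the confidence interval off the empirical percentiles. This is a hedged sketch under assumed names (`segments`, `score_fn` standing in for any corpus-level metric such as BLEU), not the paper's own code.

```python
# Percentile bootstrap confidence interval for a corpus-level MT metric.
import random

def bootstrap_ci(segments, score_fn, n_resamples=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    scores = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and rescore it.
        sample = [rng.choice(segments) for _ in segments]
        scores.append(score_fn(sample))
    scores.sort()
    lo = scores[int((alpha / 2) * n_resamples)]
    hi = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

To compare two systems, the same resamples can be scored under both systems and the interval taken over the score differences; two systems whose difference interval excludes zero differ significantly.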
Automatic Evaluation for a Palpable Measure of a Speech Translation System's Capability
The main goal of this paper is to propose automatic schemes for the translation paired comparison method. This method was proposed to precisely evaluate a speech translation system's capability. Furthermore, the method gives an objective evaluation result, i.e., a score of the Test of English for International Communication (TOEIC). The TOEIC score is used as a measure of one's speech translati...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
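The validity check such a study performs can be illustrated in a few lines: collect a metric score and a human judgment per system (or per segment) and compute their correlation. The numbers below are placeholders, not figures from the study; `statistics.correlation` requires Python 3.10+.

```python
# Correlate metric scores with human judgments (Pearson's r).
from statistics import correlation  # Python 3.10+

metric_scores = [0.21, 0.34, 0.28, 0.40]  # hypothetical BLEU per system
human_scores = [2.9, 3.8, 3.1, 4.2]       # hypothetical adequacy ratings
print(correlation(metric_scores, human_scores))
```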
Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation
When rules of transfer-based machine translation (MT) are automatically acquired from bilingual corpora, incorrect/redundant rules are generated due to acquisition errors or translation variety in the corpora. As a new countermeasure to this problem, we propose a feedback cleaning method using automatic evaluation of MT quality, which removes incorrect/redundant rules as a way to increase the e...
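One simple realization of this feedback loop is a greedy filter: drop any rule whose removal does not hurt (or improves) an automatic evaluation score on a development set. The sketch below assumes hypothetical `translate` and `evaluate` callables standing in for the MT engine and the metric; the paper's actual procedure may differ.

```python
# Greedy feedback cleaning: remove rules that are redundant or harmful
# according to an automatic evaluation score on a development set.
def feedback_clean(rules, dev_set, translate, evaluate):
    baseline = evaluate(translate(dev_set, rules))
    for rule in list(rules):  # iterate over a snapshot of the rule set
        trial = [r for r in rules if r is not rule]
        score = evaluate(translate(dev_set, trial))
        if score >= baseline:  # removal did not hurt: keep it removed
            rules, baseline = trial, score
    return rules
```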
Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?
Automatic evaluation metrics for Machine Translation (MT) systems, such as BLEU and the related NIST metric, are becoming increasingly important in MT. Yet, their behaviors are not fully understood. In this paper, we analyze some flaws in the BLEU/NIST metrics. With a better understanding of these problems, we can better interpret the reported BLEU/NIST scores. In addition, this paper reports a...
Journal:
Volume/Issue:
Pages: -
Publication date: 2002